Chapter 6: Confidence Intervals and Hypothesis Testing

When analyzing data, we can't just accept the sample mean or sample proportion as the official mean or proportion. When we compute the statistics $\bar{x}$ and $\hat{p}$ (sample mean and sample proportion), we get different answers from sample to sample due to variability. So we have to perform statistical inference:

Confidence Interval: when you want to estimate a population parameter.
Significance Testing: when we want to assess the evidence provided by the data in favor of some claim about the population.

Section 6.1: Confidence Intervals

Confidence intervals allow us to estimate a range of values for the population mean or proportion. The true mean or proportion for the population exists and is a fixed number, but we don't know it! Using sample statistics, we get an estimate of where we expect the population parameter to be.

If we take a single sample, our single confidence interval "net" may or may not include the population parameter. However, if we take many samples of the same size and create a confidence interval from each sample statistic, over the long run 95% of our confidence intervals will contain the true population parameter (if we are using a 95% confidence level).
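The long-run interpretation above can be checked with a small simulation. The sketch below is only an illustration: it assumes a normal population with a made-up mean and standard deviation, uses the z interval for a mean that Section 6.1 introduces, and the function name `coverage_rate` is hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean


def coverage_rate(mu, sigma, n, level=0.95, trials=2000, seed=1):
    """Fraction of simulated z intervals that contain the true mean mu."""
    rng = random.Random(seed)
    # z* leaves area `level` between -z* and z* on the standard normal curve
    z_star = NormalDist().inv_cdf(0.5 + level / 2)
    margin = z_star * sigma / math.sqrt(n)
    hits = 0
    for _ in range(trials):
        # Draw one SRS of size n from the assumed normal population
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = fmean(sample)
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / trials


# Illustrative population values (assumptions, not from the notes)
rate = coverage_rate(mu=100, sigma=15, n=30)
print(f"share of intervals containing the true mean: {rate:.3f}")
```

Any single interval either contains the true mean or it does not; the 95% describes the method over many repeated samples, so the printed share should land close to 0.95.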

If you increase your sample size (n), you decrease your margin of error.
If you increase your confidence level (C), you increase your margin of error.
A smaller margin of error is good because we get a smaller range of where to expect the true population parameter.

Confidence interval formulas look like: estimate ± margin of error. We write the intervals as (lower bound, upper bound).

Confidence Interval for a Population Mean, $\mu$:

$$\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}}$$

where $z^*$ is the value on the standard normal curve with area C between $-z^*$ and $z^*$.

C     90%     95%     99%
z*    1.645   1.960   2.576

(Table D in the back of the book contains more values, but these are the most common.)

Sample Size, n, for Desired Margin of Error, m:

$$n = \left( \frac{z^* \sigma}{m} \right)^2$$

Note that it is the sample size, n, that influences the margin of error. The population size has nothing to do with it.

Ways to reduce your margin of error:
1.) Increase sample size
2.) Use a lower level of confidence (smaller C)
3.) Reduce $\sigma$

Be careful!!!! You can only use the formula $\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}}$ under certain circumstances:

Data must be an SRS from the population. Do not use if the sampling is anything more complicated than an SRS.
Data must be collected correctly (no bias). The margin of error covers only random sampling errors. Undercoverage and nonresponse are not covered.
Outliers can have a big effect on the confidence interval. (This makes sense because we use the mean and standard deviation to get a CI.)
You must know the standard deviation of the population, $\sigma$.
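The two formulas above translate directly into code. This is a minimal sketch, assuming the population standard deviation is known; the function names `z_star`, `z_interval`, and `sample_size` and the numbers in the usage lines are illustrative, not from the notes.

```python
import math
from statistics import NormalDist


def z_star(level):
    """Critical value with area `level` between -z* and z*."""
    return NormalDist().inv_cdf(0.5 + level / 2)


def z_interval(x_bar, sigma, n, level):
    """Confidence interval x_bar +/- z* * sigma / sqrt(n) as (lower, upper)."""
    margin = z_star(level) * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin


def sample_size(sigma, m, level):
    """Smallest n whose margin of error is at most m (always round up)."""
    return math.ceil((z_star(level) * sigma / m) ** 2)


# Hypothetical numbers for illustration only
lower, upper = z_interval(x_bar=50, sigma=8, n=64, level=0.95)
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
print("n needed for margin 1:", sample_size(sigma=8, m=1, level=0.95))
```

The sample size is rounded up rather than to the nearest integer, because rounding down would give a margin of error slightly larger than the one requested.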

EXAMPLE 1: A questionnaire of spending habits was given to a random sample of college students. Each student was asked to record and report the amount of money they spent on textbooks in a semester. The sample of 130 students resulted in an average of $422 with standard deviation of $57.

a) Give a 90% confidence interval for the mean amount of money spent by college students on textbooks.
b) Is it true that 90% of the students spent the amount of money found in the interval from part (a)? Explain your answer.
c) What is the margin of error for the 90% confidence interval?
d) How many students should you sample if you want a margin of error of $5 for a 90% confidence interval?

EXAMPLE 2: A sample of 12 STAT 301 students yields the following Exam 1 scores:

78 62 99 85 94 53 88 90 86 92 75 92

Assume that the population standard deviation is 10. The sample mean can be calculated using SPSS or a calculator to be 82.83. (Note: Do NOT use any SPSS confidence intervals; they are good only for Chapter 7, not this type of CI. You must get these Z confidence intervals by hand.)

a) Find the 90% confidence interval for the mean score for STAT 301 students.
b) Find the 95% confidence interval.
c) Find the 99% confidence interval.
d) How do the margins of error in (a), (b), and (c) change as the confidence level increases? Why?
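One way to check the hand calculations for Example 2 is the short script below. It is a sketch that applies the z interval with the stated population standard deviation of 10; the variable names are illustrative, and it is not a substitute for working the intervals by hand as the note asks.

```python
import math
from statistics import NormalDist, fmean

scores = [78, 62, 99, 85, 94, 53, 88, 90, 86, 92, 75, 92]
sigma = 10  # population standard deviation given in the example
n = len(scores)
x_bar = fmean(scores)

margins = {}
for level in (0.90, 0.95, 0.99):
    # z* with area `level` between -z* and z*
    z = NormalDist().inv_cdf(0.5 + level / 2)
    margins[level] = z * sigma / math.sqrt(n)
    lower, upper = x_bar - margins[level], x_bar + margins[level]
    print(f"{level:.0%} CI: ({lower:.2f}, {upper:.2f}), margin {margins[level]:.2f}")
```

Only z* changes between the three intervals, so the margin of error grows with the confidence level while the center stays at the sample mean.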

Section 6.2: Hypothesis Testing

The 4 steps common to all tests of significance:
1. State the null hypothesis $H_0$ and the alternative hypothesis $H_a$.
2. Calculate the value of the test statistic.
3. Draw a picture of what $H_a$ looks like, and find the P-value.
4. State your conclusion about the data in a sentence, using the P-value and/or comparing the P-value to a significance level for your evidence.

STEP 1: State the null hypothesis $H_0$ and the alternative hypothesis $H_a$.

To do a significance test, you need 2 hypotheses:
$H_0$, Null Hypothesis: the statement being tested, usually phrased as "no effect" or "no difference."
$H_a$, Alternative Hypothesis: the statement we hope or suspect is true instead of $H_0$.

Hypotheses always refer to some population or model, not to a particular outcome.

Hypotheses can be one-sided or two-sided.

One-sided hypothesis: covers just part of the range for your parameter.
$H_0: \mu = 10$, $H_a: \mu > 10$   OR   $H_0: \mu = 10$, $H_a: \mu < 10$

Two-sided hypothesis: covers the whole possible range for your parameter.
$H_0: \mu = 10$, $H_a: \mu \neq 10$

Even though $H_a$ is what we hope or believe to be true, our test gives evidence for or against $H_0$ only. We never prove $H_0$ true; we can only state whether we have enough evidence to reject $H_0$ (which is evidence in favor of $H_a$, but not proof that $H_a$ is true) or that we don't have enough evidence to reject $H_0$.

Example (Exercise 6.37, p. 418): Each of the following situations requires a significance test about a population mean. State the appropriate null hypothesis $H_0$ and alternative hypothesis $H_a$ in each case:

a. Census Bureau data shows that the mean household income in the area served by a shopping mall is $72,500 per year. A market research firm questions shoppers at the mall to find out whether the mean household income of mall shoppers is higher than that of the general population.
b. Last year, your company's service technicians took an average of 1.8 hours to respond to trouble calls from business customers who had purchased service contracts. Do this year's data show a different average response time?

STEP 2: Calculate the value of the test statistic.

A test statistic measures compatibility between $H_0$ and the data. The formula for the test statistic will vary between different types of problems. In problems like those we studied in Chapter 6, the test statistic will be the Z score.

STEP 3: Draw a picture of what $H_a$ looks like, and find the P-value.

P-value: the probability, computed assuming that $H_0$ is true, that the test statistic would take a value as extreme or more extreme than that actually observed due to random fluctuation. It is a measure of how unusual your sample results are. The smaller the P-value, the stronger the evidence against $H_0$ provided by the data.

Calculate the P-value by using the sampling distribution of the test statistic (only the normal distribution for Chapter 6).

STEP 4: Compare your P-value to a significance level. State your conclusion about the data in a sentence.

Compare the P-value to a significance level, $\alpha$. If the P-value $\leq \alpha$, we can reject $H_0$. If you can reject $H_0$, your results are significant. If you do not reject $H_0$, your results are not significant.

Z Test for a Population Mean

To test the hypothesis $H_0: \mu = \mu_0$ based on an SRS of size n from a population with unknown mean $\mu$ and known standard deviation $\sigma$, compute the test statistic:

$$Z_0 = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$

The P-values for a test of $H_0$ against:
$H_a: \mu > \mu_0$ is $P(Z \geq Z_0)$
$H_a: \mu < \mu_0$ is $P(Z \leq Z_0)$
$H_a: \mu \neq \mu_0$ is $2 \cdot P(Z \geq |Z_0|)$

These P-values are exact if the population is normally distributed, and are approximately correct for large n in other cases.
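The test statistic and the three P-value rules above can be sketched as one small function. The name `z_test`, the labels for the alternatives, and the numbers in the usage lines are assumptions made for illustration.

```python
import math
from statistics import NormalDist


def z_test(x_bar, mu_0, sigma, n, alternative):
    """Return (z statistic, P-value) for H0: mu = mu_0 with known sigma.

    alternative is 'greater' (Ha: mu > mu_0), 'less' (Ha: mu < mu_0),
    or 'two-sided' (Ha: mu != mu_0).
    """
    z = (x_bar - mu_0) / (sigma / math.sqrt(n))
    std_normal = NormalDist()
    if alternative == "greater":
        p_value = 1 - std_normal.cdf(z)             # P(Z >= z)
    elif alternative == "less":
        p_value = std_normal.cdf(z)                 # P(Z <= z)
    elif alternative == "two-sided":
        p_value = 2 * (1 - std_normal.cdf(abs(z)))  # 2 * P(Z >= |z|)
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p_value


# Hypothetical numbers for illustration only
z, p = z_test(x_bar=52, mu_0=50, sigma=8, n=64, alternative="greater")
print(f"z = {z:.2f}, one-sided P-value = {p:.4f}")
```

For the same data, the two-sided P-value is twice the smaller one-sided P-value, which is why the direction of $H_a$ has to be chosen before looking at the sample.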

EXAMPLES

1. Last year the government made a claim that the average income of the American people was $33,950. However, a sample of 50 people taken recently showed an average income of $34,076 with a population standard deviation of $324. Is the government's estimate too low? Conduct a significance test to see if the true mean is more than the reported average. Use $\alpha = 0.01$.

2. An environmentalist collects a liter of water from 45 different locations along the banks of a stream. He measures the amount of dissolved oxygen in each specimen. The mean oxygen level is 4.62 mg, with the overall standard deviation of 0.92. A water purifying company claims that the mean level of oxygen in the water is 5 mg. Conduct a hypothesis test with $\alpha = 0.001$ to determine whether the mean oxygen level is less than 5 mg.

3. An agro-economist examines the cellulose content of a variety of alfalfa hay. Suppose that the cellulose content in the population has a standard deviation of 8 mg. A sample of 15 cuttings has a mean cellulose content of 145 mg. A previous study claimed that the mean cellulose content was 140 mg. Perform a hypothesis test to determine if the mean cellulose content is different from 140 mg if $\alpha = 0.05$.

How does $\alpha$ relate to confidence intervals?

If you have a 2-sided test, and if $\alpha$ and the confidence level add to 100%, you can reject $H_0$ if $\mu_0$ (the number you were checking) is not in the confidence interval.

a) Find a 95% confidence interval for the mean cellulose content from the above example.
b) Now try the test from example number 3 again, using the confidence interval from part (a) to do the hypothesis test. (The result should be the same.)

Annual Drinking Water Quality Report, 2004, Town of Brookston, IN

"I'm pleased to report that our drinking water is safe and meets federal and state requirements."

Test Results (MCL is the maximum contaminant level, the highest level of a contaminant that is allowed in drinking water.)

Contaminant            Violation (Y/N)   Level detected    Unit      MCL
Beta/photon emitters   N                 2.1 ± 3.2         mrem/yr   4
Alpha emitters         N                 0 ± 1.6           pCi/L     15
Barium                 N                 0.216             ppm       2
Copper                 N                 0.039 to 0.453    ppm       1.3
Fluoride               N                 0.01              ppm       4
Sodium                 N                 0.0               ppm       N/A

One of these violation reports should actually be a yes instead of a no. Which one is it and why? What hypotheses go along with these confidence intervals?

Note: When I called the town of Brookston office to ask them about this, the water manager called the state EPA office to get more information. What they told him was that, yes, technically I was correct, but that they don't use the confidence intervals that are reported. Apparently these are the FEDERAL EPA rules. They only use the mean. I tried to get sample size or other information, but I wasn't able to learn anything more.
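Returning to the link between two-sided tests and confidence intervals described at the start of this page, the sketch below runs both procedures on the cellulose numbers from example 3 (mean 145 mg, population standard deviation 8 mg, 15 cuttings, claimed mean 140 mg, significance level 0.05). The variable names are illustrative, and the code is meant as a check on hand work rather than the official solution.

```python
import math
from statistics import NormalDist

x_bar, mu_0, sigma, n = 145, 140, 8, 15
alpha = 0.05
std_normal = NormalDist()
std_error = sigma / math.sqrt(n)

# Two-sided z test of H0: mu = 140 against Ha: mu != 140
z = (x_bar - mu_0) / std_error
p_value = 2 * (1 - std_normal.cdf(abs(z)))
reject_by_test = p_value <= alpha

# Confidence interval whose level (1 - alpha) and alpha add to 100%
z_star = std_normal.inv_cdf(1 - alpha / 2)
lower = x_bar - z_star * std_error
upper = x_bar + z_star * std_error
reject_by_interval = not (lower <= mu_0 <= upper)

print(f"z = {z:.2f}, P-value = {p_value:.4f}, reject H0: {reject_by_test}")
print(f"95% CI: ({lower:.2f}, {upper:.2f}), 140 outside interval: {reject_by_interval}")
```

The two decisions always match for a two-sided z test, because the P-value is at most $\alpha$ exactly when the observed z score is at least $z^*$ in absolute value, which is the same as $\mu_0$ lying outside the interval.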

P-values can be more informative than a reject/do not reject $H_0$ decision based on $\alpha$. As the P-value gets smaller, the evidence for rejecting $H_0$ gets stronger.

Just because we use $\alpha = 0.05$ a lot doesn't mean that's the level you have to use; it's just the most common. There's nothing particularly special about that level.

In a large sample, even tiny deviations from the null hypothesis can be statistically significant, even if they are not practically important.

If we fail to reject $H_0$, it may be because $H_0$ is true or because our sample size is insufficient to detect the alternative.

Plot your data and look at your P-value to determine your conclusions. Could outliers be part of the problem?

A confidence interval actually estimates the size of an effect rather than simply asking if it is too large to reasonably occur by chance alone.

You must have a well-designed experiment in order for statistical inference to work. Randomization is important.
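The remark above about large samples can be illustrated numerically. In this sketch the observed deviation from the null value is held fixed at a small, made-up amount while only the sample size changes; all numbers and names are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist

mu_0 = 100.0   # hypothesized mean (illustrative)
x_bar = 100.5  # observed sample mean: a small, fixed deviation from mu_0
sigma = 10.0   # known population standard deviation (illustrative)

p_values = []
for n in (25, 400, 10_000):
    z = (x_bar - mu_0) / (sigma / math.sqrt(n))
    p = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided P-value
    p_values.append(p)
    print(f"n = {n:>6}: z = {z:.2f}, P-value = {p:.6f}")
```

The same half-point deviation is nowhere near significant with 25 observations but overwhelmingly significant with 10,000, which is why a confidence interval for the size of the effect is worth reporting alongside the P-value.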